Random shuffle of lists #10281

RaimoNiskanen · 2025-10-14T14:54:00Z

This PR adds functions rand:shuffle/1 and rand:shuffle_s/2 due to a discussion on ErlangForums: https://erlangforums.com/t/random-sort-should-be-included-in-the-lists-module/5125

There are 4 algorithms in the first commit. The suggested winner is the one remaining in the second commit.

Documentation and test cases are still missing...

github-actions · 2025-10-14T14:54:48Z

CT Test Results

2 files 97 suites 1h 7m 3s ⏱️
2 220 tests 2 168 ✅ 52 💤 0 ❌
2 606 runs 2 550 ✅ 56 💤 0 ❌

Results for commit b23c06e.

♻️ This comment has been updated with latest results.


bjorng

Nitpick: I don't know whether you intend to keep the first commit. In case you do, the last paragraph is missing a closing parenthesis, and the word "ridiculous" is misspelled.

lib/stdlib/src/rand.erl

Write a few shuffle algorithms for comparison.

shuffle1: I have found no formal statement that it is bias free, but have tried to reason around it. The algorithm should be equivalent to generating more random decimals to decide the shuffle order for elements with the same random number. It should make no difference whether the random decimals are always generated and ignored, or generated only when needed. Speed: 1.2 s for 2^20 integers on my laptop.

shuffle2: The classical textbook shuffle. Speed: 5 s for 2^20 integers on my laptop.

shuffle3: Quite a beautiful algorithm since the `gb_tree` has all the functionality in itself. Speed: 5 s for 2^20 integers on my laptop.

shuffle4: The same as the `gb_tree` above, but with a map. Uses the map key order instead of the general term order, which works just fine. Speed: 2 s for 2^20 integers on my laptop.

shuffle5: Suggested by Richard A. O'Keefe on ErlangForums as "a random variant of Quicksort". Shall we name it Quickshuffle? Really fast. Uses random numbers efficiently by looking at individual bits for the random split. Has no overhead for tagging. Just creates intermediate lists as garbage. This appears to be equivalent to shuffle1 using a random number generator with 1 bit. Speed: 0.8 s for 2^20 integers on my laptop.

shuffle6: The classical textbook shuffle. Our standard `array` module here outperforms map, probably because keys do not have to be stored; they are implicit. Speed: 2 s for 2^20 integers on my laptop.

shuffle3 and shuffle4 have the theoretical limitation that when the length of the list approaches the generator size, it takes catastrophically longer to generate a random number that has not been used. There is no check for the list length being larger than the generator size, in which case it is impossible to generate unique random numbers for all list elements, and the algorithm will simply keep on failing forever. This is for now a theoretical problem, since already for a list length whose logarithm is half that of the generator size (e.g. 2^28 with a generator size of 2^56), my laptop runs out of memory with a VM of about 30 GB.

shuffle1 and shuffle5 avoid that limitation: shuffle1 by recursing over the sublists of duplicates, so it is not affected much by fairly long lists of duplicates; shuffle5 by using only individual bits and ranges 2, 6, or 24.

The classical Fisher-Yates algorithm in shuffle2 and shuffle6 does not have this limitation either. Generating random numbers of unlimited length gets increasingly expensive, but should not be any problem for 2 or even 4 times the generator length, that is, list lengths of well over 2^200, which is well over ridiculous.
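For readers who want to see the "random variant of Quicksort" idea in code, here is a minimal sketch. It is not the code in this PR: the module name `quickshuffle_sketch` is hypothetical, and it spends one `rand:uniform(2)` call per element and split level instead of picking individual bits out of a larger random number as the commit message describes.

```erlang
-module(quickshuffle_sketch).
-export([shuffle/1]).

%% Hypothetical illustration only, not the implementation in the PR.
%% Shuffle a list by recursively splitting it into two random halves.
shuffle(List) when is_list(List) ->
    shuffle(List, []).

%% Acc collects the already shuffled tail, so no list appending is needed.
shuffle([], Acc) -> Acc;
shuffle([X], Acc) -> [X | Acc];
shuffle(List, Acc) ->
    {Zeros, Ones} = split(List, [], []),
    %% If every element landed on the same side, this simply retries
    %% the split on a list of the same length.
    shuffle(Zeros, shuffle(Ones, Acc)).

%% Send each element to one of two sublists with probability 1/2.
split([], Zeros, Ones) -> {Zeros, Ones};
split([X | Xs], Zeros, Ones) ->
    case rand:uniform(2) of
        1 -> split(Xs, [X | Zeros], Ones);
        2 -> split(Xs, Zeros, [X | Ones])
    end.
```

Each element in effect gets a lazily generated random bit string, and the result is the list ordered by those strings, which is why the commit message compares it to shuffle1 with a 1-bit generator.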

RaimoNiskanen · 2025-10-16T14:18:15Z

New algorithm selected. "Quickshuffle"?

lib/stdlib/src/rand.erl

RaimoNiskanen · 2025-10-17T08:45:50Z

I wrote a longer explanation of the algorithm

* Use raw generator as bitstream.
* Optimize 3 and 4 elements permutation by rejection sampling
* Use `div` instead of `rem` for simpler reject-and-retry test.
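A rough sketch of what a `div`-based reject-and-retry test can look like, assuming a source of `Bits` uniformly random bits. The module and function names (`reject_sketch`, `uniform_index/2`, `permute3/1`) are made up for illustration, and `rand:uniform/1` stands in for the raw generator bitstream that the commit uses.

```erlang
-module(reject_sketch).
-export([uniform_index/2, permute3/1]).

%% Hypothetical illustration only, not the implementation in the PR.
%% Return a uniform integer in 0..N-1 from a value of Bits random bits.
uniform_index(N, Bits)
  when is_integer(N), N >= 1, is_integer(Bits), Bits >= 0,
       (1 bsl Bits) >= N ->
    Range = 1 bsl Bits,
    Q = Range div N,              % width of each of the N equal buckets
    V = rand:uniform(Range) - 1,  % stand-in for Bits raw bits: 0..Range-1
    case V div Q of
        I when I < N -> I;            % V fell inside one of the N buckets
        _ -> uniform_index(N, Bits)   % V fell in the leftover tail: retry
    end.

%% Pick one of the 3! = 6 permutations of a 3-element list.
%% With 3 bits the values 6 and 7 are rejected.
permute3([A, B, C]) ->
    case uniform_index(6, 3) of
        0 -> [A, B, C];
        1 -> [A, C, B];
        2 -> [B, A, C];
        3 -> [B, C, A];
        4 -> [C, A, B];
        5 -> [C, B, A]
    end.
```

With `div`, the bucket index doubles as the rejection test (`I < N`), whereas a `rem`-based version needs a separate comparison of the raw value against the largest multiple of `N` that fits in the range.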

RaimoNiskanen · 2025-10-20T08:45:37Z

Pushed some optimizations

lib/stdlib/src/rand.erl

RaimoNiskanen · 2025-10-24T13:58:05Z

I have tests (and previously documentation), and backed out some optimization attempts.

With the measurement function in the test case in place, it turned out that the leaf function for permuting a list of 4 was too clumsy and gained nothing. The permutation of 3 elements could be a bit simpler.

The measurement test function compares with the previously best function: decorate, sort, undecorate, and shuffle duplicates. It also compares fast and slow PRNGs.
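For reference, the decorate, sort, undecorate and shuffle-duplicates approach can be sketched roughly as follows. This is an assumption-laden illustration rather than the reference function in the test suite: the module name `dsu_shuffle_sketch`, the default tag range of 2^26, and the extra `shuffle/2` arity are all hypothetical.

```erlang
-module(dsu_shuffle_sketch).
-export([shuffle/1, shuffle/2]).

%% Hypothetical illustration only, not the reference function in the PR.
-define(DEFAULT_RANGE, (1 bsl 26)).  % assumed tag range

shuffle(List) -> shuffle(List, ?DEFAULT_RANGE).

%% Range must be at least 2, otherwise duplicates could never be separated.
shuffle([], _Range) -> [];
shuffle([_] = L, _Range) -> L;
shuffle(List, Range) when is_integer(Range), Range >= 2 ->
    %% Decorate: tag every element with a random integer in 1..Range.
    Decorated = [{rand:uniform(Range), X} || X <- List],
    %% Sort on the tag, then strip the tags again.
    undecorate(lists:keysort(1, Decorated), Range).

%% Undecorate. Elements that share a tag would keep their input order
%% (lists:keysort/2 is stable), so each such run is shuffled recursively.
undecorate([], _Range) -> [];
undecorate([{Tag, X} | Rest], Range) ->
    case take_dups(Tag, Rest, [X]) of
        {[Single], Rest1} -> [Single | undecorate(Rest1, Range)];
        {Dups, Rest1} -> shuffle(Dups, Range) ++ undecorate(Rest1, Range)
    end.

%% Collect the run of elements that carry the same tag.
take_dups(Tag, [{Tag, X} | Rest], Acc) -> take_dups(Tag, Rest, [X | Acc]);
take_dups(_Tag, Rest, Acc) -> {Acc, Rest}.
```

Calling `dsu_shuffle_sketch:shuffle(List, 4)` forces many duplicate tags and so exercises the recursive path that a large range rarely reaches.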

Now this might be ready to merge...

RaimoNiskanen added this to the OTP-29.0 milestone Oct 14, 2025

RaimoNiskanen requested a review from bjorng October 14, 2025 14:54

RaimoNiskanen self-assigned this Oct 14, 2025

RaimoNiskanen added team:VM team:PS feature in progress priority:medium labels Oct 14, 2025

bjorng reviewed Oct 15, 2025


RaimoNiskanen force-pushed the raimo/stdlib/rand-shuffle branch from 70efe49 to aba9094 Compare October 15, 2025 07:55

jhogberg reviewed Oct 15, 2025


RaimoNiskanen force-pushed the raimo/stdlib/rand-shuffle branch from aba9094 to 0280798 Compare October 15, 2025 10:33

RaimoNiskanen added 2 commits October 16, 2025 16:06

Select the algorithm for rand:shuffle_s/2

a536947

RaimoNiskanen force-pushed the raimo/stdlib/rand-shuffle branch from 0280798 to a536947 Compare October 16, 2025 14:17

bjorng reviewed Oct 16, 2025


bjorng reviewed Oct 17, 2025


RaimoNiskanen force-pushed the raimo/stdlib/rand-shuffle branch from fb2cb14 to 8e991bf Compare October 17, 2025 08:45

RaimoNiskanen force-pushed the raimo/stdlib/rand-shuffle branch from 8e991bf to c72c71c Compare October 17, 2025 09:51

Explain the shuffle algorithm

3aeae41

RaimoNiskanen force-pushed the raimo/stdlib/rand-shuffle branch from c72c71c to 3aeae41 Compare October 18, 2025 21:45

Optimize the shuffle algorithm

e372add

* Use raw generator as bitstream.
* Optimize 3 and 4 elements permutation by rejection sampling
* Use `div` instead of `rem` for simpler reject-and-retry test.

bjorng reviewed Oct 20, 2025


Do a 4-way split instead of a 2-way

6e4d1e8

RaimoNiskanen force-pushed the raimo/stdlib/rand-shuffle branch from 3c0ceca to 6e4d1e8 Compare October 20, 2025 12:10

gproskurin reviewed Oct 20, 2025


bjorng reviewed Oct 21, 2025


Document the shuffle function

95b21d9

RaimoNiskanen force-pushed the raimo/stdlib/rand-shuffle branch from 5f73e08 to 95b21d9 Compare October 21, 2025 08:57

RaimoNiskanen added 3 commits October 24, 2025 11:09

Write tests for rand:shuffle

1299811

Remove unnecessary optimizations

5b13984

Measure compare with runner-up reference shuffle algorithm

b23c06e

RaimoNiskanen marked this pull request as ready for review October 24, 2025 13:50
